AITopics | large-scale stochastic sampling

Large-Scale Stochastic Sampling from the Probability Simplex

Neural Information Processing Systems, Nov-20-2025, 22:37:44 GMT

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, the time-discretization error can dominate near the boundary of the space. We demonstrate that, because of this, current SGMCMC methods for the simplex struggle with sparse simplex spaces, that is, spaces where many of the components are close to zero. Unfortunately, many popular large-scale Bayesian models, such as network or topic models, require inference on sparse simplex spaces. To avoid the biases caused by this discretization error, we propose the stochastic Cox-Ingersoll-Ross process (SCIR), which removes all discretization error, and we prove that samples from the SCIR process are asymptotically unbiased. We discuss how this idea can be extended to target other constrained spaces. Use of the SCIR process within an SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.

artificial intelligence, bayesian inference, machine learning, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.60)
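The abstract's claim that a Cox-Ingersoll-Ross process can be simulated without discretization error can be illustrated with a small sketch. It assumes the parameterization dθ = (a − θ) dt + sqrt(2θ) dW, whose stationary law is Gamma(a, 1) and whose transition over a step h is a scaled noncentral chi-squared draw. The function names `cir_exact_step` and `sample_simplex` are hypothetical, and the shape parameter `a` is held fixed here, whereas the stochastic version described above would presumably replace it with a minibatch estimate. This is not the authors' implementation.

```python
import numpy as np


def cir_exact_step(theta, a, h, rng):
    """Draw theta_{t+h} given theta_t exactly (no Euler-type approximation).

    Assumed SDE: d(theta) = (a - theta) dt + sqrt(2 * theta) dW.
    """
    decay = np.exp(-h)
    scale = (1.0 - decay) / 2.0
    # Noncentrality carries the dependence on the current state.
    nonc = theta * decay / scale
    return scale * rng.noncentral_chisquare(2.0 * a, nonc)


def sample_simplex(alpha, h=0.5, n_steps=2000, seed=0):
    """Run one independent CIR chain per component, then normalize.

    Each chain targets Gamma(alpha_i, 1), so the normalized vector
    targets Dirichlet(alpha).
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    theta = np.ones_like(alpha)  # any strictly positive start works
    out = np.empty((n_steps, alpha.size))
    for i in range(n_steps):
        theta = cir_exact_step(theta, alpha, h, rng)
        out[i] = theta / theta.sum()
    return out


if __name__ == "__main__":
    draws = sample_simplex([0.5, 0.5, 4.0], n_steps=4000)
    print(draws.mean(axis=0))  # should be roughly alpha / alpha.sum()
```

Because every step is an exact draw from the transition density, the iterates stay non-negative for any step size h, which is the property that an Euler-type discretization loses near the boundary.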

Reviews: Large-Scale Stochastic Sampling from the Probability Simplex

Neural Information Processing Systems, Oct-7-2024, 18:48:18 GMT

For the valuable problem of large-scale, sparse stochastic inference on the simplex, the authors propose a novel stochastic gradient Markov chain Monte Carlo (SGMCMC) method based on the Cox-Ingersoll-Ross (CIR) process. Compared with the Langevin diffusion commonly used in the SGMCMC community, the CIR process (i) is closely related to the flexible Gamma distribution, and is therefore better suited to inferring a Dirichlet distribution on the simplex, since a Dirichlet random vector is just a normalization of independent Gamma random variables; and (ii) has no discretization error, which is shown to be a clear advantage over the Langevin diffusion for inference on the simplex. In addition, the authors prove that the proposed SCIR method is asymptotically unbiased, and show improved performance over other SGMCMC methods on sparse simplex problems in two experiments: inferring an LDA model on a dataset of scraped Wikipedia documents, and inferring a Bayesian nonparametric mixture model on a Microsoft user dataset. The quality is good and the presentation is clear; as far as I know, the proposed technique is original and of great significance. The experiments are adequate but not strong. Overall, I vote for acceptance.

experiment, large-scale stochastic sampling, probability simplex, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
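The review's point (i), that a Dirichlet vector is a normalization of independent Gamma variables, can be checked numerically. The sketch below is a plain illustration of that standard identity, not code from the paper; the helper name `dirichlet_from_gammas` is hypothetical.

```python
import numpy as np


def dirichlet_from_gammas(alpha, n, rng):
    """Draw n Dirichlet(alpha) vectors by normalizing Gamma(alpha_i, 1) draws."""
    alpha = np.asarray(alpha, dtype=float)
    # One independent Gamma variable per component, per sample.
    g = rng.gamma(shape=alpha, scale=1.0, size=(n, alpha.size))
    return g / g.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = dirichlet_from_gammas([0.5, 0.5, 4.0], 20000, rng)
    print(x.mean(axis=0))  # should be roughly alpha / alpha.sum()
```

This is why a sampler that targets each Gamma component separately, such as one built on the CIR process, yields simplex-valued samples after normalization.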


Large-Scale Stochastic Sampling from the Probability Simplex

Baker, Jack, Fearnhead, Paul, Fox, Emily, Nemeth, Christopher

Neural Information Processing Systems, Feb-14-2020, 18:55:50 GMT

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, the time-discretization error can dominate near the boundary of the space. We demonstrate that, because of this, current SGMCMC methods for the simplex struggle with sparse simplex spaces, that is, spaces where many of the components are close to zero. Unfortunately, many popular large-scale Bayesian models, such as network or topic models, require inference on sparse simplex spaces.

large-scale stochastic sampling, probability simplex, sparse simplex space, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.63)
